Schema-Based Compression of XML Data with Relax NG

Authors

  • Christopher League
  • Kenjone Eng
Abstract

The extensible markup language XML has become indispensable in many areas, but a significant disadvantage is its size: tagging a set of data increases the space needed to store it, the bandwidth needed to transmit it, and the time needed to parse it. We present a new compression technique based on the document type, expressed as a Relax NG schema. Assuming the sender and receiver agree in advance on the document type, conforming documents can be transmitted extremely compactly. On several data sets with high tag density this technique compresses better than other known XML-aware compressors, including those that consider the document type.
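To make the idea of a shared document type concrete, here is a minimal sketch. It is not the authors' algorithm: the tuple-based `SCHEMA` is a hypothetical stand-in for a Relax NG grammar, and the `encode` and `decode` helpers are illustrative names. It assumes both sides hold the same schema, so the encoder only needs to send what the schema cannot predict: repetition counts, choice indices, and text values. A real compressor would also entropy-code this token stream.

```python
import xml.etree.ElementTree as ET

# Hypothetical toy schema (not Relax NG syntax). Each element is (tag, content),
# where content is one of:
#   ("text",)                character data
#   ("seq", [elem, ...])     fixed sequence of child elements
#   ("star", elem)           zero or more repetitions of one child element
#   ("choice", [elem, ...])  exactly one of several child elements
SCHEMA = ("addressbook", ("star",
    ("card", ("seq", [
        ("name", ("text",)),
        ("contact", ("choice", [
            ("email", ("text",)),
            ("phone", ("text",)),
        ])),
    ]))))

def encode(node, elem, out):
    """Append to `out` only the information the schema does not already fix."""
    tag, content = elem
    assert node.tag == tag, "document does not conform to the schema"
    kind = content[0]
    if kind == "text":
        out.append(node.text or "")
    elif kind == "seq":
        children = list(node)
        assert len(children) == len(content[1])
        for child, child_elem in zip(children, content[1]):
            encode(child, child_elem, out)  # tags are implied by position
    elif kind == "star":
        children = list(node)
        out.append(len(children))           # how many repetitions follow
        for child in children:
            encode(child, content[1], out)
    elif kind == "choice":
        (child,) = list(node)
        index = [e[0] for e in content[1]].index(child.tag)
        out.append(index)                   # which alternative was taken
        encode(child, content[1][index], out)
    return out

def decode(elem, stream):
    """Rebuild the element tree from the schema plus the token stream."""
    tag, content = elem
    node = ET.Element(tag)
    kind = content[0]
    if kind == "text":
        node.text = next(stream)
    elif kind == "seq":
        for child_elem in content[1]:
            node.append(decode(child_elem, stream))
    elif kind == "star":
        for _ in range(next(stream)):
            node.append(decode(content[1], stream))
    elif kind == "choice":
        node.append(decode(content[1][next(stream)], stream))
    return node

xml_text = (
    "<addressbook>"
    "<card><name>Ada</name><contact><email>ada@example.org</email></contact></card>"
    "<card><name>Bob</name><contact><phone>555-0100</phone></contact></card>"
    "</addressbook>"
)
tokens = encode(ET.fromstring(xml_text), SCHEMA, [])
restored = ET.tostring(decode(SCHEMA, iter(tokens)), encoding="unicode")
```

In this sketch no tag name appears in `tokens`; every tag is recovered from the schema during decoding, which is why documents with high tag density stand to benefit most from this kind of approach.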


Related resources

An Empirical Evaluation of Simple DTD-Conscious Compression Techniques

To avoid ambiguity, in this paper, the term “XML compression” is used in the first (and, we believe, original and most accurate) sense exclusively. We will compare our proposed techniques only with other approaches that address problem (1), not problems (2) or (3). Since XML markup often displays a high degree of redundancy, ordinary text compressors (gzip [7], bzip2 [15], etc.) are frequently ...


Convert Xml Schema To Relational Schema

You can create source models from your relational source schema data of your metadata to be required in order to convert datatypes or to interpret your metadata. As with Designer's JDBC, Salesforce and WSDL importers, the XML File. Editing and validation support for XML Schema, Relax NG, NVDL scripts, Browse, edit, or query using XQuery and SQL with native XML or relational oXygen includes a to...


Random XML sampling the Boltzmann way

The Extensible Markup Language (XML) is extensively used today, either to encode documents (like in XHTML) or to serialize structured data. The XML standard[4] only defines some basic syntax rules followed by well-formed documents. However applications often define a set of higher order syntactic rules that an XML document must respect to be considered as valid for the given application. A set ...


Towards static type checking of Web query language

This article reports on a research project investigating the following two complementary issues: (1) improving how the structure of XML and HTML can be specified, (2) using structure specification (of XML and HTML documents) for static type checking of Web (and Semantic Web) query programs. The first step towards this goal is to provide a schema language like DTD, XML Schema or Relax-NG with be...


The polymake XML File Format

We describe an XML file format for storing data from computations in algebra and geometry. We also present a formal specification based on a RELAX-NG schema.



Journal:
  • JCP

Volume 2

Publication date: 2007